bandit game
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Government (1.00)
- Leisure & Entertainment > Games (0.68)
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Real-world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) the agents are inherently asymmetric and partitioned into leaders and followers; (2) the agents have different reward functions, thus the game is general-sum. The majority of existing results in this field focuses on either symmetric solution concepts (e.g., the Nash equilibrium) or zero-sum games. It remains open how to learn the Stackelberg equilibrium---an asymmetric analog of the Nash equilibrium---in general-sum games efficiently from noisy samples. This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium, in the bandit feedback setting where we only observe noisy samples of the reward. We consider three representative two-player general-sum games: bandit games, bandit-reinforcement learning (bandit-RL) games, and linear bandit games.
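The bandit-game setting in this abstract can be illustrated with a small plug-in sketch: estimate both players' mean rewards from noisy bandit feedback, then compute the Stackelberg equilibrium of the estimated game (the follower best-responds; the leader commits to the action maximizing her own reward given that response). The 3x3 payoff matrices, noise level, and sample counts below are made-up illustrations, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3 general-sum bandit game: true (unknown) mean rewards
# for the leader and the follower at each action pair (a, b).
leader_means = np.array([[0.9, 0.2, 0.1],
                         [0.5, 0.6, 0.4],
                         [0.3, 0.8, 0.7]])
follower_means = np.array([[0.1, 0.8, 0.3],
                           [0.7, 0.2, 0.5],
                           [0.4, 0.6, 0.9]])

def sample_rewards(a, b, n):
    """Draw n noisy (bandit-feedback) reward samples for action pair (a, b)."""
    noise = rng.normal(scale=0.1, size=(2, n))
    return leader_means[a, b] + noise[0], follower_means[a, b] + noise[1]

# Uniform exploration: estimate both reward matrices from noisy samples.
n_samples = 2000
leader_hat = np.zeros_like(leader_means)
follower_hat = np.zeros_like(follower_means)
for a in range(3):
    for b in range(3):
        r_l, r_f = sample_rewards(a, b, n_samples)
        leader_hat[a, b] = r_l.mean()
        follower_hat[a, b] = r_f.mean()

# Plug-in Stackelberg equilibrium: for each leader action, the follower
# best-responds to the estimated rewards; the leader commits to the
# action that maximizes her own estimated reward given that response.
best_response = follower_hat.argmax(axis=1)
leader_values = leader_hat[np.arange(3), best_response]
leader_action = int(leader_values.argmax())
print(leader_action, int(best_response[leader_action]))
```

With these (well-separated) payoffs and enough samples, the plug-in solution matches the true Stackelberg equilibrium; the paper's contribution is quantifying how many samples such a scheme provably needs.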
Incentivized Learning in Principal-Agent Bandit Games
Scheid, Antoine, Tiapkin, Daniil, Boursier, Etienne, Capitaine, Aymeric, Mhamdi, El Mahdi El, Moulines, Eric, Jordan, Michael I., Durmus, Alain
This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent. The principal and the agent have misaligned objectives and the choice of action is only left to the agent. However, the principal can influence the agent's decisions by offering incentives which add up to his rewards. The principal aims to iteratively learn an incentive policy to maximize …

Real-world decision-making problems, however, often present challenges that are not addressed in this simple optimization framework. These include the challenge of scarcity when there are multiple decision-makers, issues of misaligned objectives, and problems arising from information asymmetries and signaling. The economics literature addresses these issues through the design of game-theoretic mechanisms, including auctions and contracts (see, e.g., Myerson, 1989; Laffont & Martimort, 2009), aiming to achieve favorable outcomes despite agents' self-interest and limited information set.
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
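A minimal toy version of the incentive problem described in this entry, with made-up reward vectors: the agent greedily maximizes his own reward plus the offered transfer, and a fully informed principal can compute the cheapest transfer steering him to each arm. (In the paper's actual setting the principal does not know these quantities and must learn the incentive policy online.)

```python
import numpy as np

# Hypothetical 4-armed setting: the agent's and the principal's mean
# rewards are misaligned (these numbers are made up for illustration).
agent_reward = np.array([0.8, 0.3, 0.5, 0.2])
principal_reward = np.array([0.1, 0.9, 0.4, 0.6])

def agent_choice(incentive):
    """The agent greedily picks the arm maximizing own reward + incentive."""
    return int(np.argmax(agent_reward + incentive))

# With no incentive, the agent picks arm 0 -- the principal's worst arm.
assert agent_choice(np.zeros(4)) == 0

# A fully informed principal can steer the agent to arm k by paying the
# agent's utility gap on arm k (plus a small margin to break ties).
eps = 0.01
def steer_cost(k):
    return agent_reward.max() - agent_reward[k] + eps

# The principal prefers the arm maximizing her reward net of the transfer.
net = np.array([principal_reward[k] - steer_cost(k) for k in range(4)])
best_arm = int(np.argmax(net))
print(best_arm)
```

Here the transfer makes the misaligned arm 1 worthwhile for both parties; the learning problem is to reach this outcome when `agent_reward` is unknown and only noisy bandit feedback is observed.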
Optimistic Thompson Sampling for No-Regret Learning in Unknown Games
Li, Yingru, Liu, Liangqi, Pu, Wenqiang, Liang, Hao, Luo, Zhi-Quan
This work tackles the complexities of multi-player scenarios in unknown games, where the primary challenge lies in navigating the uncertainty of the environment through bandit feedback alongside strategic decision-making. We introduce Thompson Sampling (TS)-based algorithms that exploit the information of opponents' actions and reward structures, leading to a substantial reduction in experimental budgets -- achieving over tenfold improvements compared to conventional approaches. Notably, our algorithms demonstrate that, given specific reward structures, the regret bound depends logarithmically on the total action space, significantly alleviating the curse of multi-player. Furthermore, we unveil the Optimism-then-NoRegret (OTN) framework, a pioneering methodology that seamlessly incorporates our advancements with established algorithms, showcasing its utility in practical scenarios such as traffic routing and radar sensing in the real world.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
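As background for this entry, a plain Gaussian Thompson Sampling loop for a single-player bandit (not the paper's optimistic multi-player variant) looks like the following sketch; the arm means, horizon, and noise level are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 5-armed Gaussian bandit, unknown to the learner.
true_means = np.array([0.1, 0.4, 0.35, 0.8, 0.2])
n_arms, horizon, noise_sd = len(true_means), 3000, 0.5

# Gaussian posterior per arm via running sums/counts (N(0, 1) prior).
counts = np.zeros(n_arms)
sums = np.zeros(n_arms)

for t in range(horizon):
    # Conjugate posterior N(mean, var) for each arm's mean reward.
    post_var = noise_sd**2 / (counts + 1.0)
    post_mean = sums / (counts + 1.0)
    # Thompson step: sample one plausible mean per arm, play the argmax.
    arm = int(np.argmax(rng.normal(post_mean, np.sqrt(post_var))))
    reward = rng.normal(true_means[arm], noise_sd)
    counts[arm] += 1
    sums[arm] += reward

best = int(np.argmax(counts))
print(best, counts[best] / horizon)
```

The loop concentrates its pulls on the best arm; the paper's contribution is an optimistic variant that additionally exploits opponents' observed actions and reward structure in multi-player unknown games.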
When is Offline Two-Player Zero-Sum Markov Game Solvable?
We study which dataset assumptions permit solving offline two-player zero-sum Markov games. In stark contrast to the offline single-agent Markov decision process, we show that the single-strategy concentration assumption is insufficient for learning a Nash equilibrium (NE) strategy in offline two-player zero-sum Markov games. On the other hand, we propose a new assumption named unilateral concentration and design a pessimism-type algorithm that is provably efficient under this assumption. In addition, we show that the unilateral concentration assumption is necessary for learning an NE strategy. Furthermore, our algorithm can achieve minimax sample complexity without any modification for two widely studied settings: datasets with the uniform concentration assumption and turn-based Markov games. Our work serves as an important initial step towards understanding offline multi-agent reinforcement learning.
- North America > United States (0.14)
- Europe > United Kingdom > England > Greater London > London (0.14)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
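For contrast with the offline setting of this entry, online computation of a zero-sum NE in a known matrix game is classical: fictitious play drives the players' empirical average strategies toward a Nash equilibrium. The sketch below runs it on matching pennies, whose equilibrium is the uniform mixture with game value 0; the payoff matrix and iteration count are illustrative.

```python
import numpy as np

# Matching pennies payoff for the row player (zero-sum: column gets -A).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

# Fictitious play: each player best-responds to the opponent's empirical
# mixture of past actions; in zero-sum games the average strategies
# converge to a Nash equilibrium.
rows = np.zeros(2)
cols = np.zeros(2)
rows[0] += 1  # arbitrary initial actions
cols[0] += 1
for _ in range(20000):
    r = int(np.argmax(A @ (cols / cols.sum())))   # row best response
    c = int(np.argmin((rows / rows.sum()) @ A))   # column best response
    rows[r] += 1
    cols[c] += 1

x, y = rows / rows.sum(), cols / cols.sum()
value = float(x @ A @ y)
print(np.round(x, 2), round(value, 3))
```

The offline question studied in the paper is precisely when such an NE strategy can instead be recovered from a fixed dataset, without the ability to pick actions online.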